Apache Pig

Apache Pig is a software framework that provides a run-time environment for executing MapReduce jobs on a Hadoop cluster, using a high-level scripting language called Pig Latin. A few highlights of the project:

  • Pig is an abstraction (a high-level programming language) over MapReduce on a Hadoop cluster.
  • Pig Latin queries/commands are compiled into one or more MapReduce jobs and then executed on a Hadoop cluster.
  • Just like a real pig can eat almost anything, Apache Pig can operate on almost any kind of data.
  • Pig offers an interactive shell called the Grunt shell for executing Pig commands.
  • DUMP and STORE are two of the most common commands in Pig. DUMP displays the results on the screen, while STORE writes the results to storage, typically HDFS.
  • Pig offers various built-in operators, functions and other constructs for performing many common operations.
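
To make the "compiled into one or more MapReduce jobs" highlight more concrete, here is a minimal Python sketch of how a small filter, group and aggregate query could conceptually break down into a map phase, a shuffle, and a reduce phase. It is only an illustration: the sample rows, the field names (user, url, bytes), the 1024-byte threshold, and all function names are assumptions made up for this example, the Pig Latin statements appear only in comments, and the code does not reflect how Pig actually plans or runs jobs.

```python
from collections import defaultdict

# Hypothetical input rows: (user, url, bytes). In Pig Latin these might come from
# something like:
#   records = LOAD 'logs.tsv' AS (user:chararray, url:chararray, bytes:long);
RECORDS = [
    ("alice", "/home", 2048),
    ("bob", "/home", 512),
    ("alice", "/docs", 4096),
    ("carol", "/docs", 1500),
]


def map_phase(records, min_bytes=1024):
    """Roughly: large = FILTER records BY bytes > 1024; then emit (user, bytes)."""
    for user, _url, size in records:
        if size > min_bytes:
            yield user, size


def shuffle(pairs):
    """Collect values by key, as a MapReduce framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Roughly:
    totals = FOREACH (GROUP large BY user)
             GENERATE group, COUNT(large), SUM(large.bytes);
    """
    return {user: (len(sizes), sum(sizes)) for user, sizes in grouped.items()}


def run_pipeline(records):
    """Chain the three phases into one conceptual MapReduce job."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    # Analogous to DUMP: show the results on the screen.
    for user, (hits, total_bytes) in sorted(run_pipeline(RECORDS).items()):
        print(user, hits, total_bytes)
```

In an actual Pig session the same logic would be a handful of Pig Latin statements ending in DUMP or STORE; the point of the sketch is only that each relational step corresponds to work done in a map or a reduce stage.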

Additional Information: Home Page | Wiki | Documentation/User Guide/Reference Manual | Mailing Lists
